URBAN analysis – FDA
Functional models for HR and EDA – results
This document provides a brief overview of the functional linear models for Heart Rate (HR) and Electrodermal Activity (EDA) fitted as part of the FDA analysis in the URBAN study.
The first section introduces the analysis, the functional models, and the results. We explain the structure of the outputs, the interpretation of the graphical outputs, and the numerical characteristics of the models.
The second section focuses on the fitted models for HR and EDA. We highlight a subset of models that appear most suitable for capturing the functional responses. For these selected models, both numerical summaries and graphical outputs are provided.
This document is intended as a concise summary of the selected models. For a detailed description of the full analysis procedure, including
- data preprocessing,
- registration,
- smoothing, and
- model fitting,
please refer to Z:/GAMU-STRECITY/04_Codes/FDA/CODE/URBAN_analysis.ipynb or its corresponding HTML version.
1 Introduction to functional models
To model the functional responses (HR, EDA) based on other functional covariates (noise, altitude, speed, pollution, etc.), we use the concurrent functional linear model.
This model relates the value of the outcome \(y\) at time \(t\) (i.e., \(y_i(t)\)) to the values of covariates \(z_j\) observed at the same time \(t\) (i.e., \(z_{ij}(t)\)). Formally, it can be written as \[ y_i(t) = \beta_0(t) + \sum_{j=1}^K \beta_j(t) z_{ij}(t) + \varepsilon_i(t), \tag{1}\] where \(K\) is the number of covariates, and \(\varepsilon_i(t)\) denotes a random error term for subject \(i = 1, 2, \dots, n\).
All models presented in Section 2 and Section 3 follow the general form in Equation 1. After preprocessing the raw data into functional form (including registration and smoothing), the models were fitted using the fRegress() function from the fda package.
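To make Equation 1 concrete, the following minimal sketch fits a concurrent model with a single covariate by ordinary least squares, separately at each point of a common time grid. It is an illustration under simplifying assumptions, not the procedure used in the study: fRegress() works with basis expansions of the coefficient functions, whereas this sketch uses raw grid values. The function name fit_concurrent_pointwise and the toy data are hypothetical.

```python
from statistics import mean


def fit_concurrent_pointwise(y, z):
    """Pointwise OLS for y_i(t) = b0(t) + b1(t) * z_i(t) + e_i(t).

    y, z: lists of curves; each curve is a list of values on a common time grid.
    Returns (b0, b1), each a list with one estimate per grid point.
    """
    b0, b1 = [], []
    for t in range(len(y[0])):
        yt = [curve[t] for curve in y]
        zt = [curve[t] for curve in z]
        y_bar, z_bar = mean(yt), mean(zt)
        sxx = sum((v - z_bar) ** 2 for v in zt)
        sxy = sum((v - z_bar) * (w - y_bar) for v, w in zip(zt, yt))
        slope = sxy / sxx  # assumes the covariate varies across subjects at time t
        b1.append(slope)
        b0.append(y_bar - slope * z_bar)
    return b0, b1


# Toy data (hypothetical): 3 subjects, 2 time points, generated without noise
# from b0 = [60, 62] and b1 = [2, -1].
z = [[0.0, 1.0], [1.0, 2.0], [2.0, 4.0]]
y = [[60.0, 61.0], [62.0, 60.0], [64.0, 58.0]]
b0, b1 = fit_concurrent_pointwise(y, z)
print(b0, b1)
```

Because the toy responses are noise-free, the estimates should recover the generating coefficient values up to floating-point error.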
1.1 Example model
To illustrate the structure of the results and to explain the graphical and numerical outputs of a functional linear model, consider the following simple example:
\[ \text{Heart rate}_i(t) = \beta_0(t) + \beta_1(t) \cdot \text{Altitude}_i(t) + \beta_2(t) \cdot \text{Noise}_i(t) + \varepsilon_i(t), \quad i = 1, \dots, 39. \tag{2}\]
In this model for HR, two functional covariates are included: Altitude and Noise. The dataset consists of 39 observations, corresponding to the participants with valid recordings for all variables used in the model.
This example serves to demonstrate how to read the model outputs, including both the graphical displays and the numerical characteristics.
1.2 Observed vs. estimated curves
The first output of the model is a plot of the observed trajectories of the outcome variable (here, Heart Rate) together with the trajectories estimated by the fitted functional model. Estimated curves are shown as dashed lines, while observed curves are shown as solid lines. Even with this simple model, we see that for the vast majority of participants the observed and estimated trajectories align well.
1.3 Functional \(R^2\)
One useful characteristic of the fitted functional linear model is the functional \(R^2\), an analogue of the classical coefficient of determination \(R^2\) in the setting of functional data analysis. As in the classical case, it quantifies the proportion of total variability in the response curves that is explained by the model.
Unlike the scalar \(R^2\), the functional \(R^2\) varies over time and can therefore be visualized as a curve. To facilitate comparisons between models, we also summarize it with numerical descriptors:
- Maximum \(R^2\) (maxR2): the maximum value of functional \(R^2\) over the observed interval (higher is better),
- Integrated \(R^2\) (integratedR2): the area under the functional \(R^2\) curve (higher is better, although functional \(R^2\) may occasionally take negative values on some subintervals).
We also compute a third numerical characteristic of the model, the Root Mean Square Error (\(RMSE\)), which summarizes the quality of the model fit as the square root of the average of the mean squared error curve, calculated from the differences between the observed and estimated curves.
| | RMSE | maxR2 | integratedR2 |
|---|---|---|---|
| model characteristics | 8.736482 | 0.4111378 | 6.871452 |
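The three descriptors above can be sketched on a discrete time grid as follows. This is a minimal illustration, assuming that the functional \(R^2\) is computed pointwise as one minus the ratio of the residual to the total sum of squares, that the integral is approximated by the trapezoidal rule, and that the mean squared error curve is averaged over grid points. The exact definitions used in the analysis may differ; the function name and toy data are hypothetical.

```python
from math import sqrt
from statistics import mean


def functional_fit_summary(y, y_hat, grid):
    """Pointwise R^2 curve plus maxR2, integratedR2 and RMSE.

    y, y_hat: observed and fitted curves (lists of lists on the same grid).
    grid: increasing list of time points.
    """
    r2_curve, mse_curve = [], []
    for t in range(len(grid)):
        obs = [curve[t] for curve in y]
        fit = [curve[t] for curve in y_hat]
        obs_mean = mean(obs)
        sse = sum((o - f) ** 2 for o, f in zip(obs, fit))
        sst = sum((o - obs_mean) ** 2 for o in obs)
        r2_curve.append(1.0 - sse / sst)  # may be negative where the fit is poor
        mse_curve.append(sse / len(obs))
    # Trapezoidal approximation of the area under the R^2 curve.
    integrated = sum(
        0.5 * (r2_curve[k] + r2_curve[k + 1]) * (grid[k + 1] - grid[k])
        for k in range(len(grid) - 1)
    )
    return {
        "r2_curve": r2_curve,
        "maxR2": max(r2_curve),
        "integratedR2": integrated,
        "RMSE": sqrt(mean(mse_curve)),
    }


# Toy data (hypothetical): 3 curves on a grid of 3 time points.
grid = [0.0, 1.0, 2.0]
y = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
y_hat = [[1.0, 3.0, 6.0], [2.0, 4.0, 6.0], [3.0, 5.0, 6.0]]
summary = functional_fit_summary(y, y_hat, grid)
print(summary)
```

Since the integrated value is an area, it grows with the length of the time interval, which is why it can exceed 1 even though the pointwise \(R^2\) cannot.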
1.4 Functional \(F\)-test
The functional permutation \(F\)-test checks whether the functional predictors matter in a functional linear regression model (a test of no effect in functional linear regression). It compares the fit of the model with the fits obtained after randomly permuting the data, which indicates whether the observed patterns are meaningful or could arise by chance.
Similarly to the functional \(R^2\), the \(F\)-statistic from the functional permutation \(F\)-test varies over time and can therefore be visualized as a curve. The outputs of the permutation \(F\)-test are:
- \(F_{obs}\) (Fobs): the observed maximal \(F\)-statistic (the maximal value of the pointwise \(F\)-statistic).
- \(q\)-quantile (qval): the \(q\)th quantile of the null distribution to compare to the observed \(F\)-statistic (\(q = 0.95\)).
- \(p\)-value (pval): the observed \(p\)-value of the permutation test.
| | pval | qval | Fobs |
|---|---|---|---|
| permutation \(F\)-test outputs | 0.01 | 0.4733055 | 0.6124866 |
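The logic of the permutation test can be sketched as follows for a model with one covariate. This is a simplified illustration, not the implementation in the fda package. It assumes that the pointwise \(F\)-statistic is the ratio of the variance of the fitted values to the mean squared residual, that whole response curves are permuted across subjects, and that the \(p\)-value is the proportion of permuted maxima that reach the observed maximum. All function names and the simulated data are hypothetical.

```python
import random
from statistics import mean, variance


def fitted_curves(y, z):
    """Pointwise OLS fit of y_i(t) on z_i(t); returns the fitted curves."""
    n, n_times = len(y), len(y[0])
    fitted = [[0.0] * n_times for _ in range(n)]
    for t in range(n_times):
        yt = [curve[t] for curve in y]
        zt = [curve[t] for curve in z]
        y_bar, z_bar = mean(yt), mean(zt)
        sxx = sum((v - z_bar) ** 2 for v in zt)
        slope = sum((v - z_bar) * (w - y_bar) for v, w in zip(zt, yt)) / sxx
        for i in range(n):
            fitted[i][t] = y_bar + slope * (zt[i] - z_bar)
    return fitted


def max_f_statistic(y, z):
    """Maximum over time of Var(fitted values) / mean squared residual (assumed form)."""
    fitted = fitted_curves(y, z)
    f_curve = []
    for t in range(len(y[0])):
        fit_t = [curve[t] for curve in fitted]
        resid_sq = [(yc[t] - fc[t]) ** 2 for yc, fc in zip(y, fitted)]
        f_curve.append(variance(fit_t) / mean(resid_sq))
    return max(f_curve)


def permutation_f_test(y, z, n_perm=200, q=0.95, seed=1):
    rng = random.Random(seed)
    f_obs = max_f_statistic(y, z)
    null = []
    for _ in range(n_perm):
        shuffled = y[:]  # permute whole response curves across subjects
        rng.shuffle(shuffled)
        null.append(max_f_statistic(shuffled, z))
    null.sort()
    p_val = sum(f >= f_obs for f in null) / n_perm
    q_val = null[int(q * (n_perm - 1))]  # simple empirical quantile
    return {"pval": p_val, "qval": q_val, "Fobs": f_obs}


# Simulated data (hypothetical): 10 subjects, 5 time points, strong covariate effect.
data_rng = random.Random(0)
z = [[data_rng.uniform(0, 1) for _ in range(5)] for _ in range(10)]
y = [[1.0 + 2.0 * v + data_rng.gauss(0, 0.05) for v in curve] for curve in z]
result = permutation_f_test(y, z)
print(result)
```

With this definition the \(p\)-value is a multiple of 1/n_perm, so it can be exactly zero when no permuted statistic reaches the observed one.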
1.5 Regression parameters
All previous graphical and numerical outputs of the functional model in Equation 2 describe the quality of the model fit considering all covariates together.
To understand the effect of each functional covariate separately on the outcome (HR), we now focus on the estimated functional regression parameters \(\widehat{\beta_0}(t)\) and \(\widehat{\beta_j}(t), j = 1, 2\). Additionally, to assess the significance of the covariates both locally and globally, we calculate and visualize:
- Pointwise bootstrap confidence intervals for the regression parameters \({\beta_0}(t)\) and \({\beta_j}(t), j = 1, 2\) (local, pointwise significance).
- Global envelope: using the GET package, we construct a \(95\%\) global envelope for the test statistic of \(H_0: \beta_j(t) = 0, j = 1, 2\).
The global envelope test evaluates the significance of a covariate globally across the entire time interval. It provides a nonparametric \(p\)-value, indicating whether the covariate has a significant effect on the outcome.
Importantly, the test identifies not only whether the factor of interest is significant, but also which part of the functional domain is responsible for a potential rejection.
To inspect this visually, we plot the global envelope using the regression coefficients as test functions. If the \(p\)-value of the global envelope test is significant, at least one time point exists where the regression coefficient \(\widehat{\beta}_j(t)\) exits the global envelope, revealing the domain responsible for the effect. In these intervals, the covariate has a significant impact on the outcome variable.
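The pointwise bootstrap confidence intervals mentioned in the list above can be sketched as follows for one covariate. This is a minimal illustration assuming a percentile bootstrap that resamples whole subjects with replacement and refits a pointwise least-squares slope; the bootstrap scheme used in the analysis is not specified here and may differ. The global envelope test is not reproduced in this sketch. Function names and simulated data are hypothetical.

```python
import random
from statistics import mean


def slope_curve(y, z):
    """Pointwise OLS slope b1(t); returns None if z is constant at some time point."""
    slopes = []
    for t in range(len(y[0])):
        yt = [curve[t] for curve in y]
        zt = [curve[t] for curve in z]
        y_bar, z_bar = mean(yt), mean(zt)
        sxx = sum((v - z_bar) ** 2 for v in zt)
        if sxx == 0:
            return None
        sxy = sum((v - z_bar) * (w - y_bar) for v, w in zip(zt, yt))
        slopes.append(sxy / sxx)
    return slopes


def bootstrap_pointwise_ci(y, z, n_boot=500, level=0.95, seed=1):
    """Percentile bootstrap interval for b1(t) at each grid point."""
    rng = random.Random(seed)
    n = len(y)
    draws = []
    while len(draws) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample subjects with replacement
        est = slope_curve([y[i] for i in idx], [z[i] for i in idx])
        if est is not None:  # skip degenerate resamples
            draws.append(est)
    lo_k = int((1 - level) / 2 * (n_boot - 1))
    hi_k = int((1 + level) / 2 * (n_boot - 1))
    lower, upper = [], []
    for t in range(len(y[0])):
        values = sorted(d[t] for d in draws)
        lower.append(values[lo_k])
        upper.append(values[hi_k])
    return lower, upper


# Simulated data (hypothetical): 10 subjects, 5 time points, true slope 2.
data_rng = random.Random(0)
z = [[data_rng.uniform(0, 1) for _ in range(5)] for _ in range(10)]
y = [[1.0 + 2.0 * v + data_rng.gauss(0, 0.05) for v in curve] for curve in z]
lower, upper = bootstrap_pointwise_ci(y, z)
# Time points where the interval does not cover zero (local, pointwise significance).
excludes_zero = [lo > 0 or hi < 0 for lo, hi in zip(lower, upper)]
print(excludes_zero)
```

Such intervals are valid only point by point; they do not control the error rate over the whole time interval, which is the role of the global envelope.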
Regression parameter \(\beta_0\)
The functional intercept \(\widehat{\beta_0}(t)\) represents the mean heart rate over time when all covariates are zero (the baseline curve at the reference level of the covariates). Any deviations from \(\beta_0(t)\) in the observed heart rate curves are captured by the effects of \(\beta_1(t)\) and \(\beta_2(t)\), which show how altitude and noise influence heart rate at each time point.
Regression parameter \(\beta_j\)
For the regression parameters \(\beta_1\) and \(\beta_2\), we have both bootstrap confidence intervals and global envelopes. In the plots below, we can interpret the elements as follows:
- Blue solid line: the fitted functional regression parameter from the model in Equation 2. This curve is used to estimate the heart rate curves shown in Figure 1.
- Light grey lines: pointwise bootstrap confidence intervals for the regression parameter (blue line). We focus on the subintervals where the CI does not cover zero.
- Black solid line: estimated test function from the GET approach. It should mimic the behavior of the blue line from our model.
- Grey area: global envelope corresponding to the global envelope test. We focus on the subintervals where the test function exits the global envelope (highlighted with red points).
Additionally, in the upper left corner of each plot, the \(p\)-value indicates the overall significance of the covariate of interest.
2 Models for HR
In this section, we analyze the functional models for heart rate. From the significant models (those with permutation \(F\)-test \(p\)-value \({}< 0.05\)), we identify the most suitable candidates. The following section presents the results of these selected models.
We begin by listing all fitted models (All considered models). More complex models are not included, as they are computationally demanding and in some cases fail due to singularities. For each model, we report key numerical characteristics: the number of curves (\(N\)), \(RMSE\), functional \(R^2\) descriptors, and the \(F\)-test result. The models are ordered by decreasing maximum \(R^2\).
In the Model formula column, covariates are color-coded according to their significance assessed by the global envelope test – red for \(p < 0.05\), blue for \(p \in [0.05, 0.1)\).
Next, we filter only those models significant under the functional permutation \(F\)-test (Significant models). From this subset, we select a few final models that we consider most appropriate for describing HR dynamics over time (Final selected models). The main selection criterion is the maximum \(R^2\) – higher values indicate better fit, though caution is needed when comparing models based on different sample sizes (\(N\)).
All functional models include either air pressure or altitude, and most also incorporate an activity-related predictor such as acceleration, speed, or steps.
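The filtering and ordering described above can be sketched as follows, using a handful of rows copied from the table of all considered HR models in this section. The dictionary layout and key names are hypothetical; only the numbers come from the table.

```python
# A few rows from the table of all considered HR models (id, formula, N, maxR2, pval).
models = [
    {"id": 16, "formula": "HR ~ altitude + noise + steps + PM1", "N": 19, "maxR2": 0.886322, "pval": 0.000},
    {"id": 17, "formula": "HR ~ altitude + noise + steps + PM25", "N": 19, "maxR2": 0.882461, "pval": 0.000},
    {"id": 24, "formula": "HR ~ altitude + noise + body_temperature + PM25", "N": 17, "maxR2": 0.715413, "pval": 0.180},
    {"id": 12, "formula": "HR ~ altitude + noise + speed", "N": 39, "maxR2": 0.760873, "pval": 0.000},
    {"id": 9, "formula": "HR ~ altitude + noise + PM1", "N": 23, "maxR2": 0.450873, "pval": 0.625},
    {"id": 8, "formula": "HR ~ altitude + noise", "N": 39, "maxR2": 0.411138, "pval": 0.010},
]

# Keep models significant under the permutation F-test, ordered by decreasing maximum R^2.
significant = sorted(
    (m for m in models if m["pval"] < 0.05),
    key=lambda m: m["maxR2"],
    reverse=True,
)

for m in significant:
    print(m["id"], m["formula"], m["N"], m["maxR2"], m["pval"])
```

Printing \(N\) next to the maximum \(R^2\) keeps the sample size visible, which matters because models fitted on few curves tend to reach higher \(R^2\) values.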
| | Model formula | \(N\) | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|---|---|
| 16 | HR \(\sim\) altitude + noise + steps + PM1 | 19 | 7.596198 | 0.886322 | 15.440500 | 0.000 |
| 17 | HR \(\sim\) altitude + noise + steps + PM25 | 19 | 7.595585 | 0.882461 | 15.194607 | 0.000 |
| 14 | HR \(\sim\) altitude + noise + speed + PM25 | 23 | 7.246352 | 0.850977 | 16.165170 | 0.000 |
| 13 | HR \(\sim\) altitude + noise + speed + PM1 | 23 | 7.229381 | 0.846894 | 16.343329 | 0.000 |
| 12 | HR \(\sim\) altitude + noise + speed | 39 | 8.373718 | 0.760873 | 12.628704 | 0.000 |
| 21 | HR \(\sim\) altitude + noise + temperature + speed | 33 | 8.263548 | 0.757306 | 14.611970 | 0.000 |
| 22 | HR \(\sim\) altitude + noise + temperature + steps | 32 | 8.262888 | 0.736877 | 13.816507 | 0.000 |
| 15 | HR \(\sim\) altitude + noise + steps | 32 | 8.555006 | 0.718060 | 11.376717 | 0.000 |
| 24 | HR \(\sim\) altitude + noise + body_temperature + PM25 | 17 | 7.254627 | 0.715413 | 15.033106 | 0.180 |
| 30 | HR \(\sim\) altitude + steps + Sex | 39 | 8.308018 | 0.710620 | 11.956099 | 0.000 |
| 28 | HR \(\sim\) altitude + speed + PM25 | 31 | 7.603597 | 0.707587 | 12.763065 | 0.005 |
| 27 | HR \(\sim\) altitude + speed + PM1 | 31 | 7.608386 | 0.707431 | 12.732527 | 0.005 |
| 23 | HR \(\sim\) altitude + noise + body_temperature + PM1 | 17 | 7.204590 | 0.705722 | 15.502226 | 0.195 |
| 29 | HR \(\sim\) altitude + speed + Sex | 47 | 8.265521 | 0.680346 | 12.408756 | 0.000 |
| 6 | HR \(\sim\) altitude + noise + acceleration + PM25 | 19 | 7.557765 | 0.667754 | 15.215718 | 0.375 |
| 26 | HR \(\sim\) altitude + noise + temperature (Entrant) | 24 | 8.332604 | 0.656385 | 15.753599 | 0.040 |
| 5 | HR \(\sim\) altitude + noise + acceleration + PM1 | 19 | 7.531307 | 0.651828 | 15.359033 | 0.465 |
| 18 | HR \(\sim\) altitude + noise + temperature + acceleration | 32 | 8.334506 | 0.557506 | 12.402946 | 0.145 |
| 7 | HR \(\sim\) altitude + noise + acceleration + Sex | 31 | 8.668593 | 0.554813 | 12.215680 | 0.195 |
| 4 | HR \(\sim\) altitude + noise + acceleration | 32 | 8.588232 | 0.539411 | 9.702973 | 0.115 |
| 20 | HR \(\sim\) altitude + noise + body_temperature | 28 | 8.399143 | 0.533202 | 11.206135 | 0.045 |
| 31 | HR \(\sim\) altitude + body_temperature | 34 | 8.464485 | 0.478750 | 8.416988 | 0.005 |
| 9 | HR \(\sim\) altitude + noise + PM1 | 23 | 7.666164 | 0.450873 | 9.603953 | 0.625 |
| 10 | HR \(\sim\) altitude + noise + PM25 | 23 | 7.685670 | 0.449567 | 9.376699 | 0.645 |
| 3 | HR \(\sim\) air_pressure + noise + temperature | 25 | 8.724708 | 0.440012 | 9.678972 | 0.210 |
| 25 | HR \(\sim\) altitude + noise + temperature (Empatica) | 33 | 8.567088 | 0.429390 | 9.358159 | 0.085 |
| 8 | HR \(\sim\) altitude + noise | 39 | 8.736482 | 0.411138 | 6.871452 | 0.010 |
| 2 | HR \(\sim\) air_pressure + noise | 30 | 8.784591 | 0.401951 | 6.733928 | 0.040 |
| 33 | HR \(\sim\) altitude + temperature + PM25 | 27 | 7.684850 | 0.393268 | 9.166168 | 0.430 |
| 32 | HR \(\sim\) altitude + temperature + PM1 | 27 | 7.685915 | 0.393187 | 9.208156 | 0.445 |
| 11 | HR \(\sim\) altitude + noise + Sex | 38 | 8.518559 | 0.379981 | 9.999879 | 0.035 |
Final selected models are highlighted in blue.
| | Model formula | \(N\) | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|---|---|
| 16 | HR \(\sim\) altitude + noise + steps + PM1 | 19 | 7.596198 | 0.886322 | 15.440500 | 0.000 |
| 17 | HR \(\sim\) altitude + noise + steps + PM25 | 19 | 7.595585 | 0.882461 | 15.194607 | 0.000 |
| 14 | HR \(\sim\) altitude + noise + speed + PM25 | 23 | 7.246352 | 0.850977 | 16.165170 | 0.000 |
| 13 | HR \(\sim\) altitude + noise + speed + PM1 | 23 | 7.229381 | 0.846894 | 16.343329 | 0.000 |
| 12 | HR \(\sim\) altitude + noise + speed | 39 | 8.373718 | 0.760873 | 12.628704 | 0.000 |
| 21 | HR \(\sim\) altitude + noise + temperature + speed | 33 | 8.263548 | 0.757306 | 14.611970 | 0.000 |
| 22 | HR \(\sim\) altitude + noise + temperature + steps | 32 | 8.262888 | 0.736877 | 13.816507 | 0.000 |
| 15 | HR \(\sim\) altitude + noise + steps | 32 | 8.555006 | 0.718060 | 11.376717 | 0.000 |
| 30 | HR \(\sim\) altitude + steps + Sex | 39 | 8.308018 | 0.710620 | 11.956099 | 0.000 |
| 28 | HR \(\sim\) altitude + speed + PM25 | 31 | 7.603597 | 0.707587 | 12.763065 | 0.005 |
| 27 | HR \(\sim\) altitude + speed + PM1 | 31 | 7.608386 | 0.707431 | 12.732527 | 0.005 |
| 29 | HR \(\sim\) altitude + speed + Sex | 47 | 8.265521 | 0.680346 | 12.408756 | 0.000 |
| 26 | HR \(\sim\) altitude + noise + temperature (Entrant) | 24 | 8.332604 | 0.656385 | 15.753599 | 0.040 |
| 20 | HR \(\sim\) altitude + noise + body_temperature | 28 | 8.399143 | 0.533202 | 11.206135 | 0.045 |
| 31 | HR \(\sim\) altitude + body_temperature | 34 | 8.464485 | 0.478750 | 8.416988 | 0.005 |
| 8 | HR \(\sim\) altitude + noise | 39 | 8.736482 | 0.411138 | 6.871452 | 0.010 |
| 2 | HR \(\sim\) air_pressure + noise | 30 | 8.784591 | 0.401951 | 6.733928 | 0.040 |
| 11 | HR \(\sim\) altitude + noise + Sex | 38 | 8.518559 | 0.379981 | 9.999879 | 0.035 |
| | Model formula | \(N\) | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|---|---|
| 17 | HR \(\sim\) altitude + noise + steps + PM25 | 19 | 7.595585 | 0.882461 | 15.194607 | 0.000 |
| 12 | HR \(\sim\) altitude + noise + speed | 39 | 8.373718 | 0.760873 | 12.628704 | 0.000 |
| 20 | HR \(\sim\) altitude + noise + body_temperature | 28 | 8.399143 | 0.533202 | 11.206135 | 0.045 |
| 8 | HR \(\sim\) altitude + noise | 39 | 8.736482 | 0.411138 | 6.871452 | 0.010 |
We selected the following models:
- HR \(\sim\) altitude + noise + steps + PM25 – this model achieves the second-highest maximum \(R^2\) among all candidates (0.882, just below the 0.886 of the otherwise identical model with PM1). Including PM25 is more relevant for short-term dependence than PM1, as it reflects acute effects. The model shows strong explanatory power, with both high integrated \(R^2\) and low \(RMSE\), and is overall highly significant.
- HR \(\sim\) altitude + noise – serves as a baseline model with only two predictors, yet still statistically significant. It is easy to interpret and provides a useful benchmark for evaluating the added value of more complex models. While not achieving the top \(R^2\), it performs reasonably well given its simplicity.
- HR \(\sim\) altitude + noise + body_temperature – an extension of the baseline model that introduces body temperature as an additional predictor. This variable shows partial significance.
- HR \(\sim\) altitude + noise + speed – another extension of the baseline model, where speed is added as a highly significant covariate. This model combines high maximum \(R^2\) with a large sample size (\(N\)), matching that of the baseline model.
2.1 HR \(\sim\) alt + noise + steps + PM25
The formula for this functional linear model is of the form:
\[ \text{Heart rate}_i(t) = \beta_0(t) + \beta_1(t) \cdot \text{Altitude}_i(t) + \beta_2(t) \cdot \text{Noise}_i(t) + \beta_3(t) \cdot \text{steps}_i(t) + \beta_4(t) \cdot \text{PM25}_i(t) + \varepsilon_i(t). \]
Numerical characteristics for the model:
| | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|
| model characteristics | 7.595585 | 0.8824614 | 15.19461 | 0 |
Observed and estimated outcome trajectories, graph of the functional \(R^2\) and \(F\)-statistic:
Finally, graphs of the estimated functional regression parameters \(\widehat{\beta_j}(t), j = 0, 1, \dots, 4\) for the intercept and the covariates included in the model, with bootstrap CI and global envelope.
2.2 HR \(\sim\) alt + noise
The formula for this functional linear model is of the form:
\[ \text{Heart rate}_i(t) = \beta_0(t) + \beta_1(t) \cdot \text{Altitude}_i(t) + \beta_2(t) \cdot \text{Noise}_i(t) + \varepsilon_i(t). \]
Numerical characteristics for the model:
| | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|
| model characteristics | 8.736482 | 0.4111378 | 6.871452 | 0.01 |
Observed and estimated outcome trajectories, graph of the functional \(R^2\) and \(F\)-statistic:
Finally, graphs of the estimated functional regression parameters \(\widehat{\beta_j}(t), j = 0, 1, 2\) for the intercept and the covariates included in the model, with bootstrap CI and global envelope.
2.3 HR \(\sim\) alt + noise + body_temp
The formula for this functional linear model is of the form:
\[ \text{Heart rate}_i(t) = \beta_0(t) + \beta_1(t) \cdot \text{Altitude}_i(t) + \beta_2(t) \cdot \text{Noise}_i(t) + \beta_3(t) \cdot \text{body\_temperature}_i(t) + \varepsilon_i(t). \]
Numerical characteristics for the model:
| | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|
| model characteristics | 8.399143 | 0.5332019 | 11.20614 | 0.045 |
Observed and estimated outcome trajectories, graph of the functional \(R^2\) and \(F\)-statistic:
Finally, graphs of the estimated functional regression parameters \(\widehat{\beta_j}(t), j = 0, 1, 2, 3\) for the intercept and the covariates included in the model, with bootstrap CI and global envelope.
2.4 HR \(\sim\) alt + noise + speed
The formula for this functional linear model is of the form:
\[ \text{Heart rate}_i(t) = \beta_0(t) + \beta_1(t) \cdot \text{Altitude}_i(t) + \beta_2(t) \cdot \text{Noise}_i(t) + \beta_3(t) \cdot \text{speed}_i(t) + \varepsilon_i(t). \]
Numerical characteristics for the model:
| | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|
| model characteristics | 8.373718 | 0.760873 | 12.6287 | 0 |
Observed and estimated outcome trajectories, graph of the functional \(R^2\) and \(F\)-statistic:
Finally, graphs of the estimated functional regression parameters \(\widehat{\beta_j}(t), j = 0, 1, 2, 3\) for the intercept and the covariates included in the model, with bootstrap CI and global envelope.
3 Models for EDA
In this section, we analyze the functional models for electrodermal activity. Following the same procedure as for HR, we identify the most suitable candidates from the significant models (those with permutation \(F\)-test \(p\)-value \({}< 0.05\)). The following section presents the results of these selected models.
As in Section 2, we begin by listing all fitted models (All considered models). More complex models are not included, as they are computationally demanding and in some cases fail due to singularities. For each model, we report key numerical characteristics: the number of curves (\(N\)), \(RMSE\), functional \(R^2\) descriptors, and the \(F\)-test result. The models are ordered by decreasing maximum \(R^2\).
In the Model formula column, covariates are color-coded according to their significance assessed by the global envelope test – red for \(p < 0.05\), blue for \(p \in [0.05, 0.1)\).
Next, we filter only those models significant under the functional permutation \(F\)-test (Significant models). From this subset, we select a few final models that we consider most appropriate for describing EDA dynamics over time (Final selected models). There are only two significant functional models and we select both of them.
Almost all functional models include either air pressure or altitude, and most also incorporate an activity-related predictor such as acceleration, speed, or steps. In contrast to the HR functional models, temperature-related variables – air temperature or body temperature – are also frequently included.
| | Model formula | \(N\) | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|---|---|
| 22 | EDA \(\sim\) altitude + noise + body_temperature + PM1 | 14 | 0.945528 | 0.688733 | 21.958199 | 0.555 |
| 23 | EDA \(\sim\) altitude + noise + body_temperature + PM25 | 14 | 0.947110 | 0.677125 | 22.525824 | 0.425 |
| 6 | EDA \(\sim\) altitude + noise + acceleration + PM25 | 18 | 1.010421 | 0.660726 | 15.608628 | 0.200 |
| 5 | EDA \(\sim\) altitude + noise + acceleration + PM1 | 18 | 1.009266 | 0.649744 | 15.440150 | 0.245 |
| 25 | EDA \(\sim\) altitude + noise + temperature (Entrant) | 19 | 0.971104 | 0.634487 | 14.239710 | 0.095 |
| 12 | EDA \(\sim\) altitude + noise + speed + PM1 | 19 | 0.990957 | 0.582579 | 15.192438 | 0.215 |
| 16 | EDA \(\sim\) altitude + noise + steps + PM25 | 18 | 1.096776 | 0.575932 | 40.109877 | 0.480 |
| 15 | EDA \(\sim\) altitude + noise + steps + PM1 | 18 | 1.097739 | 0.575689 | 41.578101 | 0.350 |
| 13 | EDA \(\sim\) altitude + noise + speed + PM25 | 19 | 0.992560 | 0.574583 | 15.338634 | 0.190 |
| 9 | EDA \(\sim\) altitude + noise + PM25 | 19 | 1.020716 | 0.555826 | 12.185977 | 0.410 |
| 8 | EDA \(\sim\) altitude + noise + PM1 | 19 | 1.018820 | 0.547578 | 11.974803 | 0.415 |
| 27 | EDA \(\sim\) altitude + temperature + PM1 | 26 | 1.038146 | 0.546560 | 13.712147 | 0.065 |
| 19 | EDA \(\sim\) altitude + noise + temperature + body_temperature | 23 | 0.983598 | 0.546079 | 16.127379 | 0.170 |
| 29 | EDA \(\sim\) noise + temperature + body_temperature | 23 | 0.999761 | 0.545080 | 14.498624 | 0.040 |
| 17 | EDA \(\sim\) altitude + noise + temperature + acceleration | 31 | 0.959756 | 0.533264 | 14.448738 | 0.090 |
| 28 | EDA \(\sim\) altitude + temperature + PM25 | 26 | 1.043578 | 0.512391 | 13.000250 | 0.095 |
| 24 | EDA \(\sim\) altitude + noise + temperature | 32 | 0.980686 | 0.502600 | 11.136726 | 0.010 |
| 20 | EDA \(\sim\) altitude + noise + temperature + speed | 32 | 0.978453 | 0.498659 | 25.745134 | 0.765 |
| 21 | EDA \(\sim\) altitude + noise + temperature + steps | 31 | 0.967796 | 0.484247 | 13.279872 | 0.260 |
| 18 | EDA \(\sim\) altitude + noise + body_temperature | 23 | 1.015597 | 0.456796 | 13.112231 | 0.285 |
| 4 | EDA \(\sim\) altitude + noise + acceleration | 31 | 1.019642 | 0.441674 | 8.819600 | 0.235 |
| 3 | EDA \(\sim\) air_pressure + noise + temperature | 24 | 1.018808 | 0.412089 | 10.848122 | 0.540 |
| 11 | EDA \(\sim\) altitude + noise + speed | 32 | 1.008945 | 0.399442 | 10.770685 | 0.765 |
| 10 | EDA \(\sim\) altitude + noise + Sex | 31 | 1.021243 | 0.360871 | 8.206034 | 0.330 |
| 7 | EDA \(\sim\) altitude + noise | 32 | 1.026654 | 0.356377 | 6.133042 | 0.265 |
| 14 | EDA \(\sim\) altitude + noise + steps | 31 | 1.023609 | 0.343626 | 7.898181 | 0.525 |
| 26 | EDA \(\sim\) altitude + body_temperature | 28 | 1.017492 | 0.337989 | 6.511186 | 0.190 |
| 2 | EDA \(\sim\) air_pressure + noise | 24 | 1.053468 | 0.337832 | 6.318836 | 0.790 |
Note that the models in the first half of the table above, which have the highest maximum \(R^2\), are based on a small number of curves (\(N = 14, 18, 19\)). In contrast, the second half of the table contains only models with \(N > 20\).
Final selected models are highlighted in blue.
| | Model formula | \(N\) | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|---|---|
| 24 | EDA \(\sim\) altitude + noise + temperature | 32 | 0.980686 | 0.50260 | 11.13673 | 0.01 |
| 29 | EDA \(\sim\) noise + temperature + body_temperature | 23 | 0.999761 | 0.54508 | 14.49862 | 0.04 |
| | Model formula | \(N\) | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|---|---|
| 24 | EDA \(\sim\) altitude + noise + temperature | 32 | 0.980686 | 0.50260 | 11.13673 | 0.01 |
| 29 | EDA \(\sim\) noise + temperature + body_temperature | 23 | 0.999761 | 0.54508 | 14.49862 | 0.04 |
We selected the following models:
- EDA \(\sim\) altitude + noise + temperature – this model achieves the lowest \(RMSE\) among the significant candidates and the lowest \(p\)-value of the functional \(F\)-test among all models. The inclusion of the covariate altitude, which is non-significant according to GET, improves the fit and carries information about the route. The model reaches a maximum \(R^2\) of 0.503 and an integrated \(R^2\) of 11.14. Moreover, it is fitted on a larger sample than the second selected model (\(N = 32\) versus \(N = 23\)).
- EDA \(\sim\) noise + temperature + body_temperature – compared with the model above, the covariate altitude is omitted and the covariate body_temperature is added. Of the two significant models, this one achieves the higher maximum \(R^2\) and the higher integrated \(R^2\), while also being significant (\(p = 0.040\)).
We distinguish two models with the same covariates – one with (Empatica) and one with (Entrant) – because air temperature was measured by two different devices. To maximize the sample size in the models, we choose to use the Empatica measurements.
3.1 EDA \(\sim\) alt + noise + temp
The formula for this functional linear model is of the form:
\[ \text{EDA}_i(t) = \beta_0(t) + \beta_1(t) \cdot \text{Altitude}_i(t) + \beta_2(t) \cdot \text{Noise}_i(t) + \beta_3(t) \cdot \text{Temperature}_i(t) + \varepsilon_i(t). \]
Numerical characteristics for the model:
| | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|
| model characteristics | 0.9806856 | 0.5026002 | 11.13673 | 0.01 |
Observed and estimated outcome trajectories, graph of the functional \(R^2\) and \(F\)-statistic:
Finally, graphs of the estimated functional regression parameters \(\widehat{\beta_j}(t), j = 0, 1, 2, 3\) for the intercept and the covariates included in the model, with bootstrap CI and global envelope.
3.2 EDA \(\sim\) noise + temp + body_temp
The formula for this functional linear model is of the form:
\[ \text{EDA}_i(t) = \beta_0(t) + \beta_1(t) \cdot \text{Noise}_i(t) + \beta_2(t) \cdot \text{Temperature}_i(t) + \beta_3(t) \cdot \text{Body temperature}_i(t) + \varepsilon_i(t). \]
Numerical characteristics for the model:
| | \(RMSE\) | Maximum \(R^2\) | Integrated \(R^2\) | \(p\)-value |
|---|---|---|---|---|
| model characteristics | 0.9997611 | 0.5450801 | 14.49862 | 0.04 |
Observed and estimated outcome trajectories, graph of the functional \(R^2\) and \(F\)-statistic:
Finally, graphs of the estimated functional regression parameters \(\widehat{\beta_j}(t), j = 0, 1, 2, 3\) for the intercept and the covariates included in the model, with bootstrap CI and global envelope.
4 Conclusion
To sum up the functional analysis performed for Heart Rate and Electrodermal Activity, here are the key points of the process:
- For each target functional variable (HR and EDA), we fit around 30 functional linear models.
- For each functional model, we calculate a few numerical characteristics to assess the quality of the model fit and to be able to compare models.
- From all models for HR and EDA, we filter only those that are significant (\(p\)-value \({}<0.05\)) according to the permutation \(F\)-test (Section 1.4).
- From the significant models for HR and EDA, we select a small number of models according to the highest maximum \(R^2\) (Section 1.3).
- For the selected models, we assess the significance of each functional covariate separately using the global envelope test (Section 1.5).
To sum up the results of our analysis, here is a list of key findings:
- for modelling HR, we select 4 models (Section 2):
  - covariates altitude and noise have a significant effect on the value of heart rate (Section 2.1 – Section 2.4),
  - covariates related to movement (speed, steps) also have a significant effect on the value of heart rate (Section 2.1, Section 2.4),
  - we are able to interpret where the effect is significant (identify the time domain responsible for rejection),
  - the covariate of air pollution PM25 has a non-significant effect on HR, but improves the quality of the overall fit (Section 2.1).
- for modelling EDA, we select 2 models (Section 3):
  - the covariate noise has a significant effect on the values of EDA (Section 3.1, Section 3.2),
  - the covariate air temperature has a significant effect on EDA in one model (Section 3.1),
  - the covariate body temperature has a near-significant effect on EDA (Section 3.2),
  - again, we are able to interpret where the effect is significant (identify the time domain responsible for rejection),
  - the covariate altitude has a non-significant effect on EDA, but improves the fit and includes information about the route (Section 3.1).